- aact_studies.tsv
- aact_drugs.tsv
- aact_descriptions.tsv
- aact_drugs_leadmine.tsv
- aact_drugs_smi_pubchem_cid.tsv
- aact_drugs_smi_pubchem_cid2inchi.tsv
- aact_drugs_inchi2chembl.tsv
- aact_drugs_chembl_activity_pchembl.tsv
- aact_drugs_chembl_target_component.tsv
- pharos_targets.tsv
- aact_descriptions_tagger_matches.tsv
- diseases_entities.tsv

nct_id is the study ID.
## [1] "Wed Apr 3 15:13:13 2019"
library(readr)
library(data.table)
library(plotly, quietly=TRUE)
Read file of all studies in AACT.
## [1] "Total studies: 300214 ; unique NCT_IDs: 300214"
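The count above can be reproduced along the following lines. This is a minimal sketch, not the report's actual code: it uses base R and a three-row inline table as a stand-in for aact_studies.tsv, and the column names other than nct_id are assumptions.

```r
# Toy stand-in for aact_studies.tsv. The real file would be read with
# readr or data.table; the study_type column name is assumed here.
studies_txt <- "nct_id\tstudy_type\nNCT00000001\tInterventional\nNCT00000002\tObservational\nNCT00000003\tInterventional\n"
studies <- read.delim(text = studies_txt, stringsAsFactors = FALSE)

# Row count versus distinct study IDs: equal values mean one row per study.
n_total <- nrow(studies)
n_unique <- length(unique(studies$nct_id))
msg <- sprintf("Total studies: %d ; unique NCT_IDs: %d", n_total, n_unique)
print(msg)
```

Comparing the two numbers is a cheap check that nct_id can serve as a join key in the later steps.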
Read file of all drugs in AACT.
id is the AACT ID.
## [1] "Unique drug names: 91347 ; unique intervention IDs: 255077"
Select only Interventional studies (study_type) associated with drugs (via nct_id).
## [1] "Interventional studies: 237892 (79.2%)"
## [1] "Interventional drug studies: 124421 ; unique NCT_IDs: 124421"
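The selection described above amounts to a filter on study_type followed by a semi-join on nct_id. The sketch below illustrates it on hypothetical toy tables in base R; the data values and the drugs column layout are assumptions, not the report's data.

```r
# Hypothetical toy tables standing in for the studies and drugs files.
studies <- data.frame(
  nct_id = c("NCT01", "NCT02", "NCT03", "NCT04"),
  study_type = c("Interventional", "Observational",
                 "Interventional", "Interventional"),
  stringsAsFactors = FALSE
)
drugs <- data.frame(
  id = c(1L, 2L, 3L),
  nct_id = c("NCT01", "NCT01", "NCT03"),
  name = c("aspirin", "ibuprofen", "metformin"),
  stringsAsFactors = FALSE
)

# Step 1: keep interventional studies and report their share of all studies.
interventional <- studies[studies$study_type == "Interventional", ]
pct <- 100 * nrow(interventional) / nrow(studies)
print(sprintf("Interventional studies: %d (%.1f%%)", nrow(interventional), pct))

# Step 2: keep only those studies that appear in the drugs table (semi-join).
drug_studies <- interventional[interventional$nct_id %in% drugs$nct_id, ]
print(sprintf("Interventional drug studies: %d ; unique NCT_IDs: %d",
              nrow(drug_studies), length(unique(drug_studies$nct_id))))
```

Using `%in%` rather than `merge()` keeps one row per study even when a study lists several drugs.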
| phase | N_studies | N_drugs |
|---|---|---|
| Early Phase 1 | 1574 | 2615 |
| Phase 1 | 23603 | 48593 |
| Phase 1/Phase 2 | 6663 | 13288 |
| Phase 2 | 33910 | 68850 |
| Phase 2/Phase 3 | 3305 | 6503 |
| Phase 3 | 22988 | 49507 |
| Phase 4 | 19593 | 36331 |
| NA | 12785 | 29390 |
| overall_status | N |
|---|---|
| Completed | 145006 |
| Recruiting | 33973 |
| Terminated | 19618 |
| Unknown status | 18463 |
| Active, not recruiting | 13962 |
| Not yet recruiting | 8001 |
| NA | 7080 |
| Withdrawn | 6969 |
| Enrolling by invitation | 1060 |
| Suspended | 945 |
(To do: stack with study start_year.)
## Warning: Ignoring 1 observations
## Warning: Ignoring 1 observations
AACT drug names are resolved by LeadMine to standard names and chemical structures (as SMILES). With structures in hand, drugs can be counted in a cheminformatically rigorous way, as active pharmaceutical ingredients (APIs).
## [1] "Drug unique SMILES resolved by LeadMine: 4699 ; unique intervention IDs: 171741"
## [1] "Drugs (drug names) with resolved structure: 180555 / 197300 (91.5%)"
## [1] "Mentions by intervention ID: 157862 / 171741 (91.9%)"
## [1] "Mentions by study: 92966 / 99647 (93.3%)"
## [1] "Mentions by drug name: 11108 / 58297 (19.1%)"
## [1] "PubChem SMILES2CID hits: 3960 / 4698 (84.3%)"
## [1] "Intervention IDs mapped to PubChem CIDs (via SMILES): 153876"
## [1] "PubChem CIDs with InChIKeys: 3801"
## [1] "ChEMBL compounds mapped via InChIKeys: 3332"
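The counts above follow a chain of mappings: SMILES to PubChem CID, CID to InChIKey, and InChIKey to ChEMBL compound. A minimal base R sketch of such a chain is shown below. All identifiers are placeholders and all column names are assumptions; each lookup table holds only the successful hits, so an inner join drops anything unmapped.

```r
# Placeholder data: four intervention IDs resolved to three distinct SMILES.
leadmine <- data.frame(itv_id = 1:4,
                       smiles = c("SMI_A", "SMI_B", "SMI_C", "SMI_A"),
                       stringsAsFactors = FALSE)
# Lookup tables with hits only (SMI_C has no CID; KEY_B has no ChEMBL ID).
smi2cid <- data.frame(smiles = c("SMI_A", "SMI_B"), cid = c(101L, 102L),
                      stringsAsFactors = FALSE)
cid2inchi <- data.frame(cid = c(101L, 102L), inchikey = c("KEY_A", "KEY_B"),
                        stringsAsFactors = FALSE)
inchi2chembl <- data.frame(inchikey = "KEY_A", chembl_id = "CHEMBL_A",
                           stringsAsFactors = FALSE)

# merge() defaults to an inner join, so each step keeps mapped rows only.
m_cid <- merge(leadmine, smi2cid, by = "smiles")
m_inchi <- merge(m_cid, cid2inchi, by = "cid")
m_chembl <- merge(m_inchi, inchi2chembl, by = "inchikey")

n_cid <- length(unique(m_cid$cid))
n_itv_cid <- length(unique(m_cid$itv_id))
n_inchi <- length(unique(m_inchi$inchikey))
n_chembl <- length(unique(m_chembl$chembl_id))
print(sprintf("SMILES2CID hits: %d / %d", n_cid, length(unique(leadmine$smiles))))
print(sprintf("Intervention IDs mapped to CIDs: %d", n_itv_cid))
print(sprintf("CIDs with InChIKeys: %d ; ChEMBL compounds: %d", n_inchi, n_chembl))
```

Because every step is an inner join, the counts can only stay equal or shrink along the chain, which matches the pattern in the output above.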
Select only activities that have pChEMBL values, as a confidence filter.
## [1] "ChEMBL activities: 124438"
## [1] "ChEMBL activities molecules: 2287 ; targets: 3832 ; documents: 16198"
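As an illustration of the pChEMBL filter and the distinct counts reported above, here is a small base R sketch on made-up rows. The column names are assumed to follow ChEMBL's usual field names, and the values are invented.

```r
# Made-up activity rows; NA marks activities without a pChEMBL value.
activities <- data.frame(
  molecule_chembl_id = c("M1", "M1", "M2", "M3"),
  target_chembl_id = c("T1", "T2", "T1", "T3"),
  document_chembl_id = c("D1", "D1", "D2", "D3"),
  pchembl_value = c(6.5, NA, 7.2, NA),
  stringsAsFactors = FALSE
)

# Keep only activities that carry a pChEMBL value.
act <- activities[!is.na(activities$pchembl_value), ]

# Small helper for distinct counts.
n_unique <- function(x) length(unique(x))
print(sprintf("ChEMBL activities: %d", nrow(act)))
print(sprintf("ChEMBL activities molecules: %d ; targets: %d ; documents: %d",
              n_unique(act$molecule_chembl_id),
              n_unique(act$target_chembl_id),
              n_unique(act$document_chembl_id)))
```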
## [1] "ChEMBL target proteins: 3157"
## [1] "ChEMBL target proteins mapped to TCRD (human): 1806"
## [1] "Organisms: 187"
| organism | N_targets |
|---|---|
| Homo sapiens | 1806 |
| Rattus norvegicus | 529 |
| Mus musculus | 238 |
| Bos taurus | 98 |
| Sus scrofa | 36 |
| Cavia porcellus | 26 |
| Escherichia coli K-12 | 19 |
| Oryctolagus cuniculus | 18 |
| Escherichia coli | 17 |
| Mycobacterium tuberculosis | 17 |
## [1] "Human targets: 1806"
| target_type | N |
|---|---|
| SINGLE PROTEIN | 1216 |
| PROTEIN COMPLEX | 247 |
| PROTEIN FAMILY | 210 |
| PROTEIN COMPLEX GROUP | 91 |
| PROTEIN-PROTEIN INTERACTION | 16 |
| SELECTIVITY GROUP | 14 |
| CHIMERIC PROTEIN | 12 |
## [1] "Human single-protein targets: 1216 ; unique UniProts: 1216"
## [1] " Tchem: 733" " Tclin: 341" " Tbio: 140"
## [4] " Tdark: 2"
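The breakdown above restricts targets to human single proteins and then tallies them by target development level (Tclin, Tchem, Tbio, Tdark). A rough base R sketch of that restriction and tally follows; the table, its values and the column names (uniprot, organism, target_type, tdl) are hypothetical.

```r
# Hypothetical target table; tdl holds the target development level.
targets <- data.frame(
  uniprot = c("P1", "P2", "P3", "P4", "P5", "P6"),
  organism = c("Homo sapiens", "Homo sapiens", "Rattus norvegicus",
               "Homo sapiens", "Homo sapiens", "Homo sapiens"),
  target_type = c("SINGLE PROTEIN", "PROTEIN COMPLEX", "SINGLE PROTEIN",
                  "SINGLE PROTEIN", "SINGLE PROTEIN", "SINGLE PROTEIN"),
  tdl = c("Tclin", "Tchem", "Tchem", "Tchem", "Tbio", "Tchem"),
  stringsAsFactors = FALSE
)

# Restrict to human single-protein targets, then count by development level.
human_single <- targets[targets$organism == "Homo sapiens" &
                        targets$target_type == "SINGLE PROTEIN", ]
tdl_counts <- sort(table(human_single$tdl), decreasing = TRUE)
print(sprintf("Human single-protein targets: %d ; unique UniProts: %d",
              nrow(human_single), length(unique(human_single$uniprot))))
print(tdl_counts)
```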
id is the AACT primary key for the detailed_descriptions table. For disease entities, serialno corresponds to the DOID.
| doid | N_mentions | terms |
|---|---|---|
| DOID:4 | 76402 | DISEASE;Disease;dis- ease;dis-ease;disease |
| DOID:0111161 | 73734 | CAN;CaN;Can;can |
| DOID:162 | 28596 | CANCER;CANcer;Cancer;Malignant Tumor;Malignant neoplasm;Malignant tumor;Primary Cancer;Primary cancer;cancer;malignant Tumor;malignant neoplasm;malignant tumor;primary cancer |
| DOID:9351 | 17274 | DIABETES;DIABETES MELLITUS;DIAbetes;DIabetes;Diabetes;Diabetes Mellitus;Diabetes mellitus;diabetes;diabetes Mellitus;diabetes mellitus;diabetes-mellitus |
| DOID:6713 | 16632 | CVA;Cerebrovascular Accident;Cerebrovascular Disease;Cerebrovascular accident;Cerebrovascular disease;STROKE;STRokE;Stroke;cerebro- vascular disease;cerebro-vascular disease;cerebrovascular accident;cerebrovascular disease;cerebrovascular disorder;cerebrovascular syndrome;cv-a;cva;stroKe;stroke |
| DOID:2030 | 12084 | ANXIETY;Anxiety;Anxiety Disorder;Anxiety state;anxiety;anxiety disorder;anxiety state;anxiety syndrome;anxiety-state |
| DOID:1612 | 10583 | BREAST CANCER;BReast CAncer;BReast Cancer;Breast Cancer;Breast cancer;Breast tumor;Breast-cancer;Primary breast cancer;breast Cancer;breast caNcEr;breast cancer;breast tumor;breast-cancer;breastcancer;mammary cancer;mammary tumor;primary breast cancer |
| DOID:2841 | 10021 | ASTHMA;Asthma;BHR;Bronchial hyper-reactivity;Bronchial hyperreactivity;EIA;Exercise-induced asthma;asthma;bronchial hyper reactivity;bronchial hyper-reactivity;bronchial hyperreactivity;exercise induced asthma;exercise-induced asthma |
| DOID:3083 | 9782 | CHRONIC OBSTRUCTIVE PULMONARY DISEASE;COLD;COPD;COPd;Chronic Obstructive Lung Disease;Chronic Obstructive Lung disease;Chronic Obstructive Pulmonary Disease;Chronic Obstructive Pulmonary disease;Chronic Obstructive lung Disease;Chronic Obstructive pulmonary Disease;Chronic Obstructive pulmonary disease;Chronic obstructive airway disease;Chronic obstructive lung disease;Chronic obstructive pulmonary disease;Cold;chronic Obstructive Lung Disease;chronic obstructive airway disease;chronic obstructive lung disease;chronic obstructive pulmonary disease;chronic obstructive pulmonary disorder;cold;copd |
| DOID:9970 | 9303 | OBESITY;OBesity;Obesity;obEsity;obe-sity;obesity |
| DOID:10763 | 9144 | HBP;HTN;HYPERTENSION;High Blood Pressure;High blood pressure;High-blood pressure;Hypertension;Hypertensive disease;high blood Pressure;high blood pressure;high blood-pressure;htn;hyper-tension;hypertension;hypertensive disease;hypertensive disorder |
| DOID:3393 | 6816 | C-HD;CAD;CHD;CORONARY ARTERY DISEASE;CORONARY SYNDROME;CORONARY syndrome;ChD;Coronary ARtery DIsease;Coronary Artery Disease;Coronary Disease;Coronary Heart Disease;Coronary Heart disease;Coronary Syndrome;Coronary artery disease;Coronary disease;Coronary heart disease;Coronary-artery-disease;coronary Syndrome;coronary arteriosclerosis;coronary artery dis-ease;coronary artery disease;coronary disease;coronary heart disease;coronary syndrome;coronary-artery disease;coronary-artery-disease |
| DOID:0060145 | 6115 | ANALGESIA;Analgesia;analgeSia;analgesia |
| DOID:0111084 | 5958 | FACE;FaCE;Face;face |
| DOID:9352 | 5848 | Diabetes Mellitus Type 2;Diabetes Mellitus Type II;Diabetes Mellitus type 2;Diabetes Mellitus, Type II;Diabetes mellitus Type 2;Diabetes mellitus non-insulin-dependent;Diabetes mellitus type 2;Diabetes mellitus type II;NIDDM;Non-Insulin Dependent Diabetes Mellitus;Non-Insulin-Dependent-Diabetes Mellitus;Non-insulin dependent diabetes mellitus;Non-insulin-dependent Diabetes Mellitus;Type 2 - Diabetes Mellitus;Type 2 Diabetes;Type 2 Diabetes Mellitus;Type 2 Diabetes mellitus;Type 2 diabetes;Type 2 diabetes mellitus;Type 2-diabetes mellitus;Type II Diabetes;Type II Diabetes Mellitus;Type II Diabetes mellitus;Type II diabetes;Type II diabetes mellitus;Type-2 Diabetes;Type-2 Diabetes Mellitus;Type-2 diabetes;Type-2 diabetes mellitus;Type-2-diabetes;Type-II diabetes;Type2 Diabetes Mellitus;Type2 diabetes;Type2 diabetes mellitus;diabetes mellitus type 2;diabetes mellitus type II;diabetes mellitus type-2;diabetes mellitus type2;diabetes mellitus, type 2;maturity onset diabetes;maturity-onset diabetes;non insulin dependent diabetes mellitus;non insulin-dependent diabetes mellitus;non-insulin dependent diabetes mellitus;non-insulin-dependent diabetes mellitus;noninsulin-dependent diabetes mellitus;type -2 diabetes mellitus;type 2 Diabetes;type 2 Diabetes Mellitus;type 2 diabetes;type 2 diabetes mellitus;type 2-diabetes;type 2diabetes;type 2diabetes mellitus;type II Diabetes;type II Diabetes Mellitus;type II diabetes;type II diabetes mellitus;type II-diabetes;type-2 Diabetes;type-2 diabetes;type-2 diabetes mellitus;type-2-diabetes;type-II diabetes;type-II diabetes mellitus;type-II- diabetes mellitus;type2 diabetes;type2 diabetes mellitus |
| DOID:10283 | 5056 | Familial Prostate Cancer;HPC;PRostate Cancer;Prostate CAncer;Prostate Cancer;Prostate cancer;Prostatic cancer;hereditary prostate cancer;prostate Cancer;prostate cancer;prostate-cancer;prostatic cancer |
| DOID:8469 | 4985 | FLU;Flu;Influenza;flu;influenza |
| DOID:225 | 4962 | SYNDROME;Syndrome;syn drome;syndrome |
| DOID:3908 | 4959 | NSCLC;Non Small Cell Lung Cancer;Non Small Cell Lung Carcinoma;Non Small Cell Lung cancer;Non small cell lung cancer;Non small-cell lung cancer;Non- small cell lung cancer;Non-Small Cell Lung Cancer;Non-Small Cell Lung Carcinoma;Non-Small Cell Lung cancer;Non-Small cell lung cancer;Non-Small- Cell Lung Cancer;Non-Small-Cell Lung Cancer;Non-Small-Cell lung Cancer;Non-small Cell Lung Cancer;Non-small Cell Lung Carcinoma;Non-small cell Lung Cancer;Non-small cell lung cancer;Non-small cell lung carcinoma;Non-small-cell Lung Cancer;Non-small-cell lung cancer;nSCLC;non small cell lung cancer;non small cell lung carcinoma;non small-cell lung cancer;non- small cell lung cancer;non-small Cell Lung Cancer;non-small cell Lung cancer;non-small cell lung Cancer;non-small cell lung cancer;non-small cell lung carcinoma;non-small-cell lung cancer;non-small-cell lung carcinoma;non-small-cell lung-cancer;nonsmall cell lung cancer;nonsmall cell lung cancer;nonsmall- cell lung cancer |
| DOID:784 | 4841 | CKD;CKF;CRD;CRF;Chronic Kidney Disease;Chronic Kidney disease;Chronic Kidney failure;Chronic Renal Disease;Chronic kidney disease;Chronic kidney failure;Chronic renal disease;chronic Kidney disease;chronic kidney disease;chronic kidney failure;chronic renal disease;chronic renal failure syndrome;ckd;crf;renal failure chronic |
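A table of this shape can be built by grouping tagger matches by DOID, counting the mentions, and collapsing the distinct matched strings into one semicolon-separated field. The sketch below shows one way to do that in base R. The input rows and the column names (id, doid, term) are assumptions for illustration only.

```r
# Assumed layout: one row per tagger match in a description (id).
matches <- data.frame(
  id = c(1L, 1L, 2L, 3L, 3L),
  doid = c("DOID:162", "DOID:9351", "DOID:162", "DOID:162", "DOID:9351"),
  term = c("cancer", "diabetes", "Cancer", "cancer", "Diabetes mellitus"),
  stringsAsFactors = FALSE
)

# Mentions per DOID.
n_mentions <- tapply(matches$term, matches$doid, length)

# Distinct matched strings per DOID, sorted in C-locale order
# (uppercase before lowercase) and joined with ";".
terms <- tapply(matches$term, matches$doid, function(x) {
  paste(sort(unique(x), method = "radix"), collapse = ";")
})

# Assemble the summary and order by descending mention count.
doid_summary <- data.frame(
  doid = names(n_mentions),
  N_mentions = as.vector(n_mentions),
  terms = as.vector(terms[names(n_mentions)]),
  stringsAsFactors = FALSE
)
doid_summary <- doid_summary[order(-doid_summary$N_mentions), ]
print(doid_summary)
```

Keeping the raw matched strings next to the counts makes it easy to see which surface forms drive each total.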